Overview

Dataset statistics

Number of variables: 9
Number of observations: 4385
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 2
Duplicate rows (%): < 0.1%
Total size in memory: 342.6 KiB
Average record size in memory: 80.0 B
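
The two memory figures above can be cross-checked with a little arithmetic. The sketch below uses only numbers printed in this overview; the 8-bytes-per-value layout (nine 64-bit float columns plus an 8-byte Timestamp index) is an assumption and is not stated in the report.

```python
# Figures copied from the overview table.
n_rows = 4385
n_columns = 9
total_kib = 342.6

# Average record size: total bytes divided by the number of rows.
avg_record_bytes = total_kib * 1024 / n_rows
print(round(avg_record_bytes, 1))

# Assumption (not stated in the report): every column is a 64-bit float
# and the Timestamp index also costs 8 bytes per row.
assumed_bytes_per_row = (n_columns + 1) * 8
assumed_total_kib = n_rows * assumed_bytes_per_row / 1024
print(assumed_bytes_per_row, round(assumed_total_kib, 1))
```

Under that assumption the per-row and total sizes both agree with the reported values after rounding.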

Variable types

Numeric: 9

Alerts

Dataset has 2 (< 0.1%) duplicate rows [Duplicates]
FREQUENCIA BOMBA 2 is highly overall correlated with VAZÃO DE RECALQUE - FT03 and 1 other field [High correlation]
NIVEL DO RESERVATÓRIO - LT01 is highly overall correlated with VAZÃO DE RECALQUE - FT03 and 1 other field [High correlation]
VAZÃO DE RECALQUE - FT03 is highly overall correlated with FREQUENCIA BOMBA 1 and 5 other fields [High correlation]
PRESSÃO DE SUCÇÃO - PT01 is highly overall correlated with NIVEL DO RESERVATÓRIO - LT01 and 3 other fields [High correlation]
PRESSÃO DE RECALQUE - PT02 is highly overall correlated with FREQUENCIA BOMBA 1 and 4 other fields [High correlation]
FREQUENCIA BOMBA 1 is highly overall correlated with VAZÃO DE GRAVIDADE - FT02 and 2 other fields [High correlation]
VAZÃO DE GRAVIDADE - FT02 is highly overall correlated with FREQUENCIA BOMBA 1 and 3 other fields [High correlation]
FREQUENCIA BOMBA 1 has 392 (8.9%) zeros [Zeros]
FREQUENCIA BOMBA 2 has 1136 (25.9%) zeros [Zeros]
FREQUENCIA BOMBA 3 has 3687 (84.1%) zeros [Zeros]
VAZÃO DE ENTRADA- FT01 has 795 (18.1%) zeros [Zeros]
VAZÃO DE GRAVIDADE - FT02 has 272 (6.2%) zeros [Zeros]
PRESSÃO DE RECALQUE - PT02 has 79 (1.8%) zeros [Zeros]

Reproduction

Analysis started: 2022-11-28 22:28:26.435437
Analysis finished: 2022-11-28 22:28:46.848723
Duration: 20.41 seconds
Software version: pandas-profiling v3.5.0
Download configuration: config.json

Variables

FREQUENCIA BOMBA 1
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 1196
Distinct (%): 27.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51.939888
Minimum: 0
Maximum: 59.988281
Zeros: 392
Zeros (%): 8.9%
Negative: 0
Negative (%): 0.0%
Memory size: 68.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 57.842842
Median: 57.988792
Q3: 57.988792
95-th percentile: 58.317091
Maximum: 59.988281
Range: 59.988281
Interquartile range (IQR): 0.14595032

Descriptive statistics

Standard deviation: 17.142205
Coefficient of variation (CV): 0.33003932
Kurtosis: 5.1385932
Mean: 51.939888
Median Absolute Deviation (MAD): 0
Skewness: -2.6473655
Sum: 227756.41
Variance: 293.8552
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
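
Several of the statistics listed for FREQUENCIA BOMBA 1 are tied together by simple identities (IQR = Q3 − Q1, variance = standard deviation squared, CV = standard deviation / mean, sum = mean × number of observations). The following is a minimal sketch that checks those identities using only the rounded figures printed above; the tolerances are illustrative choices, not part of the report.

```python
import math

# Figures copied from the FREQUENCIA BOMBA 1 block above.
n = 4385
mean, std, variance = 51.939888, 17.142205, 293.8552
q1, q3, iqr = 57.842842, 57.988792, 0.14595032
minimum, maximum, value_range = 0.0, 59.988281, 59.988281
cv, total = 0.33003932, 227756.41

# Each identity is compared within a small tolerance because the
# report prints rounded values.
checks = {
    "IQR = Q3 - Q1": math.isclose(q3 - q1, iqr, abs_tol=1e-5),
    "Range = max - min": math.isclose(maximum - minimum, value_range, abs_tol=1e-6),
    "Variance = std^2": math.isclose(std ** 2, variance, rel_tol=1e-6),
    "CV = std / mean": math.isclose(std / mean, cv, rel_tol=1e-6),
    "Sum = mean * n": math.isclose(mean * n, total, rel_tol=1e-6),
}
for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'check rounding'}")
```

The same checks apply to the other eight variables by swapping in their printed figures.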

Value | Count | Frequency (%)
57.98879242 | 2632 | 60.0%
0 | 392 | 8.9%
59.98828125 | 45 | 1.0%
58.01076508 | 13 | 0.3%
49.99084473 | 13 | 0.3%
58.05471039 | 12 | 0.3%
44.99212646 | 11 | 0.3%
29.99230957 | 10 | 0.2%
58.12062836 | 9 | 0.2%
34.99102783 | 9 | 0.2%
Other values (1186) | 1239 | 28.3%

Value | Count | Frequency (%)
0 | 392 | 8.9%
0.01275072433 | 1 | < 0.1%
0.01913495362 | 1 | < 0.1%
0.02551918104 | 1 | < 0.1%
0.03190341219 | 1 | < 0.1%
0.04066173732 | 1 | < 0.1%
0.0472676903 | 1 | < 0.1%
0.06109762564 | 1 | < 0.1%
0.07513397187 | 1 | < 0.1%
0.08153351396 | 1 | < 0.1%

Value | Count | Frequency (%)
59.98828125 | 45 | 1.0%
59.98095703 | 2 | < 0.1%
59.9793396 | 1 | < 0.1%
59.97729492 | 2 | < 0.1%
59.97363281 | 1 | < 0.1%
59.94200516 | 1 | < 0.1%
59.94067383 | 1 | < 0.1%
59.92602539 | 1 | < 0.1%
59.92236328 | 2 | < 0.1%
59.89031601 | 1 | < 0.1%

FREQUENCIA BOMBA 2
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 3190
Distinct (%): 72.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.81337
Minimum: 0
Maximum: 59.991943
Zeros: 1136
Zeros (%): 25.9%
Negative: 0
Negative (%): 0.0%
Memory size: 68.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 34.908695
Q3: 38.02359
95-th percentile: 49.226396
Maximum: 59.991943
Range: 59.991943
Interquartile range (IQR): 38.02359

Descriptive statistics

Standard deviation: 17.608565
Coefficient of variation (CV): 0.63309714
Kurtosis: -0.92343398
Mean: 27.81337
Median Absolute Deviation (MAD): 4.5594673
Skewness: -0.71795914
Sum: 121961.63
Variance: 310.06157
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 1136 | 25.9%
29.99597168 | 12 | 0.3%
39.99707031 | 10 | 0.2%
57.99245453 | 8 | 0.2%
59.99194336 | 7 | 0.2%
44.99578857 | 6 | 0.1%
34.99468994 | 6 | 0.1%
24.99725342 | 3 | 0.1%
36.11528015 | 3 | 0.1%
49.99450684 | 3 | 0.1%
Other values (3180) | 3191 | 72.8%

Value | Count | Frequency (%)
0 | 1136 | 25.9%
0.0004281122528 | 1 | < 0.1%
0.0005553666269 | 1 | < 0.1%
0.001183174783 | 1 | < 0.1%
0.003486369271 | 1 | < 0.1%
0.003600863041 | 1 | < 0.1%
0.008166855201 | 1 | < 0.1%
0.01569584385 | 1 | < 0.1%
0.02166323923 | 1 | < 0.1%
0.02376453392 | 1 | < 0.1%

Value | Count | Frequency (%)
59.99194336 | 7 | 0.2%
59.98828125 | 3 | 0.1%
59.98324203 | 1 | < 0.1%
59.98095703 | 1 | < 0.1%
59.97058105 | 1 | < 0.1%
59.96425629 | 1 | < 0.1%
59.96066284 | 1 | < 0.1%
59.95888519 | 1 | < 0.1%
59.95774841 | 1 | < 0.1%
59.95323944 | 1 | < 0.1%

FREQUENCIA BOMBA 3
Real number (ℝ)

Distinct: 638
Distinct (%): 14.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.4058755
Minimum: 0
Maximum: 59.988281
Zeros: 3687
Zeros (%): 84.1%
Negative: 0
Negative (%): 0.0%
Memory size: 68.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 50.017325
Maximum: 59.988281
Range: 59.988281
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 16.765124
Coefficient of variation (CV): 2.6171479
Kurtosis: 3.2320295
Mean: 6.4058755
Median Absolute Deviation (MAD): 0
Skewness: 2.2674761
Sum: 28089.764
Variance: 281.06937
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 3687 | 84.1%
57.98879242 | 36 | 0.8%
0.1318343282 | 10 | 0.2%
59.98828125 | 7 | 0.2%
39.9934082 | 5 | 0.1%
54.98956299 | 4 | 0.1%
29.99230957 | 3 | 0.1%
0.1245101988 | 2 | < 0.1%
34.99102783 | 2 | < 0.1%
0.1272614598 | 1 | < 0.1%
Other values (628) | 628 | 14.3%

Value | Count | Frequency (%)
0 | 3687 | 84.1%
0.0001107446224 | 1 | < 0.1%
0.001534604002 | 1 | < 0.1%
0.005035049282 | 1 | < 0.1%
0.006258170586 | 1 | < 0.1%
0.007479577791 | 1 | < 0.1%
0.009604785591 | 1 | < 0.1%
0.0099241063 | 1 | < 0.1%
0.01236863434 | 1 | < 0.1%
0.01295140106 | 1 | < 0.1%

Value | Count | Frequency (%)
59.98828125 | 7 | 0.2%
59.96926498 | 1 | < 0.1%
59.95925522 | 1 | < 0.1%
59.94100571 | 1 | < 0.1%
59.87623978 | 1 | < 0.1%
59.86713791 | 1 | < 0.1%
59.86031342 | 1 | < 0.1%
59.85322952 | 1 | < 0.1%
59.85174561 | 1 | < 0.1%
59.83979797 | 1 | < 0.1%

NIVEL DO RESERVATÓRIO - LT01
Real number (ℝ)

HIGH CORRELATION

Distinct: 4372
Distinct (%): 99.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.2355002
Minimum: 0.29407585
Maximum: 4.4049139
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 68.5 KiB

Quantile statistics

Minimum: 0.29407585
5-th percentile: 1.9517602
Q1: 2.7917631
Median: 3.30353
Q3: 3.7749107
95-th percentile: 4.2591256
Maximum: 4.4049139
Range: 4.1108381
Interquartile range (IQR): 0.98314762

Descriptive statistics

Standard deviation: 0.69706843
Coefficient of variation (CV): 0.21544379
Kurtosis: -0.032666725
Mean: 3.2355002
Median Absolute Deviation (MAD): 0.4901371
Skewness: -0.55811943
Sum: 14187.668
Variance: 0.48590439
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
3 | 4 | 0.1%
4.300000191 | 4 | 0.1%
3.920138836 | 2 | < 0.1%
4.047839642 | 2 | < 0.1%
3.236786366 | 2 | < 0.1%
4.295138836 | 2 | < 0.1%
3.432870388 | 2 | < 0.1%
3.836458206 | 2 | < 0.1%
3.001000643 | 2 | < 0.1%
3.768044233 | 1 | < 0.1%
Other values (4362) | 4362 | 99.5%

Value | Count | Frequency (%)
0.2940758467 | 1 | < 0.1%
0.3723406196 | 1 | < 0.1%
0.4662296474 | 1 | < 0.1%
0.4815826118 | 1 | < 0.1%
0.8559085727 | 1 | < 0.1%
0.8737350106 | 1 | < 0.1%
0.8987794518 | 1 | < 0.1%
0.9570472836 | 1 | < 0.1%
0.9613910913 | 1 | < 0.1%
0.9710686803 | 1 | < 0.1%

Value | Count | Frequency (%)
4.404913902 | 1 | < 0.1%
4.403257847 | 1 | < 0.1%
4.401571751 | 1 | < 0.1%
4.401538372 | 1 | < 0.1%
4.401222229 | 1 | < 0.1%
4.400625229 | 1 | < 0.1%
4.398981094 | 1 | < 0.1%
4.398623466 | 1 | < 0.1%
4.397883892 | 1 | < 0.1%
4.39741993 | 1 | < 0.1%

VAZÃO DE ENTRADA- FT01
Real number (ℝ)

Distinct: 1922
Distinct (%): 43.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.66294
Minimum: 0
Maximum: 383.87036
Zeros: 795
Zeros (%): 18.1%
Negative: 0
Negative (%): 0.0%
Memory size: 68.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.11574074
Median: 0.11574074
Q3: 264.27155
95-th percentile: 280.08785
Maximum: 383.87036
Range: 383.87036
Interquartile range (IQR): 264.1558

Descriptive statistics

Standard deviation: 132.60141
Coefficient of variation (CV): 1.1769746
Kurtosis: -1.8552865
Mean: 112.66294
Median Absolute Deviation (MAD): 0.11574074
Skewness: 0.34275668
Sum: 494026.98
Variance: 17583.135
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0.1157407388 | 1652 | 37.7%
0 | 795 | 18.1%
0.2314814776 | 4 | 0.1%
1.273148179 | 3 | 0.1%
276.5046387 | 2 | < 0.1%
261.458313 | 2 | < 0.1%
264.6990662 | 2 | < 0.1%
276.2731628 | 2 | < 0.1%
263.541687 | 2 | < 0.1%
0.3472222388 | 2 | < 0.1%
Other values (1912) | 1919 | 43.8%

Value | Count | Frequency (%)
0 | 795 | 18.1%
0.04144435376 | 1 | < 0.1%
0.0578703694 | 1 | < 0.1%
0.1157407388 | 1652 | 37.7%
0.1275791973 | 1 | < 0.1%
0.1383209676 | 1 | < 0.1%
0.1736586094 | 1 | < 0.1%
0.176884532 | 1 | < 0.1%
0.1776106358 | 1 | < 0.1%
0.1835666746 | 1 | < 0.1%

Value | Count | Frequency (%)
383.8703613 | 1 | < 0.1%
381.5904236 | 1 | < 0.1%
376.8986206 | 1 | < 0.1%
374.4212952 | 1 | < 0.1%
370.3518372 | 1 | < 0.1%
367.4073792 | 1 | < 0.1%
366.7018738 | 1 | < 0.1%
366.4682617 | 1 | < 0.1%
365.7449341 | 1 | < 0.1%
364.6219177 | 1 | < 0.1%

VAZÃO DE GRAVIDADE - FT02
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 4114
Distinct (%): 93.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 132.94359
Minimum: 0
Maximum: 326.1713
Zeros: 272
Zeros (%): 6.2%
Negative: 0
Negative (%): 0.0%
Memory size: 68.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 123.97898
Median: 136.00012
Q3: 148.20116
95-th percentile: 179.92901
Maximum: 326.1713
Range: 326.1713
Interquartile range (IQR): 24.222176

Descriptive statistics

Standard deviation: 44.78165
Coefficient of variation (CV): 0.336847
Kurtosis: 4.9098637
Mean: 132.94359
Median Absolute Deviation (MAD): 12.126892
Skewness: -0.67329216
Sum: 582957.65
Variance: 2005.3962
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 272 | 6.2%
108.8570862 | 1 | < 0.1%
141.1200867 | 1 | < 0.1%
140.4069519 | 1 | < 0.1%
128.144989 | 1 | < 0.1%
125.5947037 | 1 | < 0.1%
153.2140656 | 1 | < 0.1%
143.0213776 | 1 | < 0.1%
144.0009308 | 1 | < 0.1%
146.4695892 | 1 | < 0.1%
Other values (4104) | 4104 | 93.6%

Value | Count | Frequency (%)
0 | 272 | 6.2%
27.51053429 | 1 | < 0.1%
30.12467003 | 1 | < 0.1%
30.16630745 | 1 | < 0.1%
30.91310501 | 1 | < 0.1%
31.05513191 | 1 | < 0.1%
41.97084808 | 1 | < 0.1%
56.65369415 | 1 | < 0.1%
56.89728165 | 1 | < 0.1%
57.67729187 | 1 | < 0.1%

Value | Count | Frequency (%)
326.1712952 | 1 | < 0.1%
324.9286499 | 1 | < 0.1%
322.9801636 | 1 | < 0.1%
320.3776245 | 1 | < 0.1%
304.5761719 | 1 | < 0.1%
302.353302 | 1 | < 0.1%
302.0870361 | 1 | < 0.1%
301.3987427 | 1 | < 0.1%
300.1445923 | 1 | < 0.1%
299.913208 | 1 | < 0.1%

VAZÃO DE RECALQUE - FT03
Real number (ℝ)

Distinct: 4114
Distinct (%): 93.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.40697
Minimum: 0
Maximum: 194.35185
Zeros: 24
Zeros (%): 0.5%
Negative: 0
Negative (%): 0.0%
Memory size: 68.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.028935185
Q1: 111.65463
Median: 118.82233
Q3: 125.62949
95-th percentile: 136.53865
Maximum: 194.35185
Range: 194.35185
Interquartile range (IQR): 13.974861

Descriptive statistics

Standard deviation: 31.328318
Coefficient of variation (CV): 0.27870442
Kurtosis: 7.2635604
Mean: 112.40697
Median Absolute Deviation (MAD): 6.9732361
Skewness: -2.6536729
Sum: 492904.54
Variance: 981.46349
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0.0289351847 | 226 | 5.2%
0 | 24 | 0.5%
117.8530121 | 3 | 0.1%
119.1956024 | 2 | < 0.1%
132.5810242 | 2 | < 0.1%
113.6284714 | 2 | < 0.1%
114.9884186 | 2 | < 0.1%
110.1273193 | 2 | < 0.1%
132.378479 | 2 | < 0.1%
117.2742996 | 2 | < 0.1%
Other values (4104) | 4118 | 93.9%

Value | Count | Frequency (%)
0 | 24 | 0.5%
0.01446759235 | 1 | < 0.1%
0.0289351847 | 226 | 5.2%
0.2875666618 | 1 | < 0.1%
0.3858614862 | 1 | < 0.1%
0.4217657745 | 1 | < 0.1%
0.5559648871 | 1 | < 0.1%
0.6147419214 | 1 | < 0.1%
0.69016397 | 1 | < 0.1%
0.8436223269 | 1 | < 0.1%

Value | Count | Frequency (%)
194.3518524 | 1 | < 0.1%
189.3903809 | 1 | < 0.1%
188.7152863 | 1 | < 0.1%
185.4285889 | 1 | < 0.1%
183.7471924 | 1 | < 0.1%
183.2344971 | 1 | < 0.1%
181.8721008 | 1 | < 0.1%
180.8506927 | 1 | < 0.1%
179.4629211 | 1 | < 0.1%
177.5573273 | 1 | < 0.1%

PRESSÃO DE SUCÇÃO - PT01
Real number (ℝ)

HIGH CORRELATION

Distinct: 4364
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.1077789
Minimum: 0.87751222
Maximum: 5.6827645
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 68.5 KiB

Quantile statistics

Minimum: 0.87751222
5-th percentile: 2.7513295
Q1: 3.6190748
Median: 4.147234
Q3: 4.6634259
95-th percentile: 5.2179035
Maximum: 5.6827645
Range: 4.8052523
Interquartile range (IQR): 1.0443511

Descriptive statistics

Standard deviation: 0.76275458
Coefficient of variation (CV): 0.1856854
Kurtosis: 0.24480127
Mean: 4.1077789
Median Absolute Deviation (MAD): 0.52290487
Skewness: -0.41059959
Sum: 18012.61
Variance: 0.58179454
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
5.541088104 | 7 | 0.2%
5.532407284 | 6 | 0.1%
5.636573792 | 4 | 0.1%
5.55555582 | 3 | 0.1%
5.53819418 | 2 | < 0.1%
4.202215195 | 2 | < 0.1%
4.284318924 | 2 | < 0.1%
5.182291985 | 2 | < 0.1%
4.845046043 | 2 | < 0.1%
5.092852592 | 1 | < 0.1%
Other values (4354) | 4354 | 99.3%

Value | Count | Frequency (%)
0.8775122166 | 1 | < 0.1%
0.8825973868 | 1 | < 0.1%
0.8876825571 | 1 | < 0.1%
0.8906169534 | 1 | < 0.1%
0.892767787 | 1 | < 0.1%
0.8949127793 | 1 | < 0.1%
0.8992086649 | 1 | < 0.1%
0.9035044909 | 1 | < 0.1%
1.519142866 | 1 | < 0.1%
1.598580122 | 1 | < 0.1%

Value | Count | Frequency (%)
5.68276453 | 1 | < 0.1%
5.668966293 | 1 | < 0.1%
5.668751717 | 1 | < 0.1%
5.666208267 | 1 | < 0.1%
5.665445328 | 1 | < 0.1%
5.663450718 | 1 | < 0.1%
5.661117077 | 1 | < 0.1%
5.660072803 | 1 | < 0.1%
5.656788349 | 1 | < 0.1%
5.652602673 | 1 | < 0.1%

PRESSÃO DE RECALQUE - PT02
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 1992
Distinct (%): 45.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.794268
Minimum: 0
Maximum: 28.084936
Zeros: 79
Zeros (%): 1.8%
Negative: 0
Negative (%): 0.0%
Memory size: 68.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.082886364
Q1: 21.721619
Median: 22.048611
Q3: 23.017941
95-th percentile: 25.954861
Maximum: 28.084936
Range: 28.084936
Interquartile range (IQR): 1.2963219

Descriptive statistics

Standard deviation: 6.1425533
Coefficient of variation (CV): 0.29539647
Kurtosis: 6.1429142
Mean: 20.794268
Median Absolute Deviation (MAD): 0.96932983
Skewness: -2.6683162
Sum: 91182.864
Variance: 37.730961
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
22.00520706 | 216 | 4.9%
23.00347328 | 183 | 4.2%
22.9745369 | 155 | 3.5%
23.01794052 | 145 | 3.3%
22.01967621 | 145 | 3.3%
21.97627258 | 140 | 3.2%
22.98900414 | 103 | 2.3%
22.04861069 | 102 | 2.3%
23.046875 | 87 | 2.0%
0 | 79 | 1.8%
Other values (1982) | 3030 | 69.1%

Value | Count | Frequency (%)
0 | 79 | 1.8%
0.002456213813 | 1 | < 0.1%
0.01011194568 | 1 | < 0.1%
0.01062385458 | 1 | < 0.1%
0.01230416168 | 1 | < 0.1%
0.01609536819 | 1 | < 0.1%
0.01681616344 | 1 | < 0.1%
0.0184647534 | 1 | < 0.1%
0.01964788325 | 1 | < 0.1%
0.02034074813 | 1 | < 0.1%

Value | Count | Frequency (%)
28.08493614 | 1 | < 0.1%
28.05792236 | 1 | < 0.1%
28.04745102 | 1 | < 0.1%
28.04636955 | 1 | < 0.1%
28.03819466 | 2 | < 0.1%
28.02372551 | 5 | 0.1%
28.01240349 | 1 | < 0.1%
28.00926018 | 5 | 0.1%
28.00657845 | 1 | < 0.1%
28.00614929 | 1 | < 0.1%

Interactions

[Scatter plots for every pair of the 9 variables (81 panels, rendered with Matplotlib v3.5.3); images not preserved.]

Correlations


Auto

The auto setting selects an interpretable pairwise metric according to the types of the two columns:
  • Variable type pair : Method, Range
  • Categorical-Categorical : Cramér's V, [0,1]
  • Numerical-Categorical : Cramér's V, [0,1] (using a discretized numerical column)
  • Numerical-Numerical : Spearman's ρ, [-1,1]
The number of bins used in the discretization for the Numerical-Categorical column pair can be changed using config.correlations["auto"].n_bins. The number of bins affects the granularity of the association you wish to measure.

This configuration uses the recommended metric for each pair of columns.
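
The mapping above can be read as a small lookup on the pair of column types. The sketch below is illustrative only: `auto_method` and `discretize` are hypothetical helper names, not pandas-profiling functions, and equal-width binning is an assumption, since the report does not say which discretization scheme is used.

```python
def auto_method(type_a, type_b):
    """Return (method, value range) for a pair of column types."""
    if type_a == "Numerical" and type_b == "Numerical":
        return "Spearman's rho", (-1, 1)
    # Categorical-Categorical and Numerical-Categorical both use Cramer's V;
    # in the mixed case the numerical column is discretized first.
    return "Cramer's V", (0, 1)


def discretize(values, n_bins):
    """Equal-width binning (an assumed scheme, for illustration)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against constant columns
    # The maximum value is clamped into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# All nine variables in this report are numeric, so every pair
# falls into the Numerical-Numerical case.
print(auto_method("Numerical", "Numerical"))
print(discretize([0, 5, 10], n_bins=2))
```

A larger `n_bins` gives a finer-grained association measure at the cost of sparser bins.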

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
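
As a minimal sketch of that recipe, the code below ranks both variables (ties receive the average rank, a common convention assumed here) and then applies the covariance-over-standard-deviations formula to the ranks. The function names and the toy data are illustrative and do not come from the report.

```python
def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship: y = x^3.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman(x, y), pearson(x, y))
```

On this toy data ρ reaches its maximum (up to floating-point rounding) while Pearson's r stays below 1, which is the behavior the paragraph above describes.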

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
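
The formula and the invariance property can be illustrated with a short sketch. The five value pairs are the FT02 and FT03 readings from the first five sample rows of this report; they are far too few to reproduce the report's correlation and serve only as example input. The function name `pearson_r` and the shift and scale constants are arbitrary choices.

```python
from math import sqrt


def pearson_r(x, y):
    """Sample covariance divided by the product of sample standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (std_x * std_y)


# First five sample rows (VAZÃO DE GRAVIDADE - FT02, VAZÃO DE RECALQUE - FT03).
ft02 = [108.857, 110.514, 109.423, 116.948, 128.389]
ft03 = [87.449, 94.778, 91.448, 91.837, 105.958]

r = pearson_r(ft02, ft03)

# Shift and rescale each variable separately (positive scale factors).
r_transformed = pearson_r([2.0 * v + 10.0 for v in ft02],
                          [0.5 * v - 3.0 for v in ft03])
print(r, r_transformed)
```

Both calls return the same value up to floating-point rounding, since location and positive scale changes cancel in the ratio.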

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
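
The counting procedure just described corresponds to the variant usually called tau-a. The sketch below implements exactly that definition; tied pairs count as neither concordant nor discordant. Statistical libraries often apply a tie correction (tau-b) instead, so values for heavily tied columns, such as the zero-inflated pump frequencies in this report, may differ from what this sketch yields. The function name and toy data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


# One swapped pair out of six possible pairs.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

Here five pairs are concordant and one is discordant, so τ = (5 − 1) / 6.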

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

Timestamp | FREQUENCIA BOMBA 1 | FREQUENCIA BOMBA 2 | FREQUENCIA BOMBA 3 | NIVEL DO RESERVATÓRIO - LT01 | VAZÃO DE ENTRADA- FT01 | VAZÃO DE GRAVIDADE - FT02 | VAZÃO DE RECALQUE - FT03 | PRESSÃO DE SUCÇÃO - PT01 | PRESSÃO DE RECALQUE - PT02
2018-01-01 18:00:00 | 49.404 | 0.000 | 0.000 | 3.747 | 0.000 | 108.857 | 87.449 | 4.809 | 16.963
2018-01-01 19:00:00 | 52.155 | 0.000 | 0.000 | 3.797 | 280.593 | 110.514 | 94.778 | 4.873 | 18.047
2018-01-01 20:00:00 | 51.284 | 0.000 | 0.000 | 3.918 | 279.349 | 109.423 | 91.448 | 5.002 | 17.925
2018-01-01 21:00:00 | 50.210 | 0.000 | 0.000 | 4.035 | 276.239 | 116.948 | 91.837 | 5.116 | 17.016
2018-01-02 18:00:00 | 56.769 | 0.000 | 0.000 | 4.083 | 279.826 | 128.389 | 105.958 | 5.077 | 20.014
2018-01-02 19:00:00 | 57.272 | 0.000 | 0.000 | 4.108 | 0.000 | 125.107 | 105.298 | 5.079 | 20.961
2018-01-02 20:00:00 | 56.636 | 0.000 | 0.000 | 3.730 | 0.000 | 127.512 | 103.857 | 4.704 | 20.048
2018-01-02 21:00:00 | 57.463 | 0.000 | 0.000 | 3.352 | 0.000 | 126.797 | 106.353 | 4.311 | 20.047
2018-01-03 18:00:00 | 57.936 | 0.000 | 0.000 | 4.263 | 269.974 | 131.095 | 110.580 | 5.218 | 20.043
2018-01-03 19:00:00 | 59.162 | 0.000 | 0.000 | 4.254 | 0.000 | 129.145 | 112.751 | 5.176 | 21.073

Timestamp | FREQUENCIA BOMBA 1 | FREQUENCIA BOMBA 2 | FREQUENCIA BOMBA 3 | NIVEL DO RESERVATÓRIO - LT01 | VAZÃO DE ENTRADA- FT01 | VAZÃO DE GRAVIDADE - FT02 | VAZÃO DE RECALQUE - FT03 | PRESSÃO DE SUCÇÃO - PT01 | PRESSÃO DE RECALQUE - PT02
2020-12-29 20:00:00 | 57.708 | 45.828 | 0.000 | 3.810 | 0.116 | 140.620 | 115.867 | 4.689 | 22.989
2020-12-29 21:00:00 | 57.684 | 45.027 | 0.000 | 3.403 | 0.116 | 132.692 | 108.486 | 4.342 | 22.020
2020-12-30 18:00:00 | 57.989 | 46.421 | 0.000 | 3.668 | 0.116 | 149.632 | 118.490 | 4.525 | 23.018
2020-12-30 19:00:00 | 57.989 | 47.173 | 0.000 | 3.235 | 0.116 | 149.119 | 120.389 | 4.072 | 23.018
2020-12-30 20:00:00 | 57.989 | 47.556 | 0.000 | 2.806 | 0.116 | 140.063 | 117.885 | 3.660 | 23.018
2020-12-30 21:00:00 | 57.989 | 46.198 | 0.000 | 2.487 | 274.195 | 132.376 | 108.883 | 3.383 | 22.066
2020-12-31 18:00:00 | 57.989 | 45.858 | 0.000 | 4.122 | 0.116 | 154.669 | 118.099 | 4.976 | 23.047
2020-12-31 19:00:00 | 57.989 | 46.798 | 0.000 | 3.670 | 0.116 | 149.437 | 121.815 | 4.499 | 23.047
2020-12-31 20:00:00 | 57.989 | 47.061 | 0.000 | 3.221 | 0.116 | 151.018 | 117.933 | 4.071 | 23.047
2020-12-31 21:00:00 | 0.000 | 45.870 | 57.989 | 2.803 | 0.116 | 129.294 | 106.693 | 3.714 | 21.976

Duplicate rows

Most frequently occurring

 | FREQUENCIA BOMBA 1 | FREQUENCIA BOMBA 2 | FREQUENCIA BOMBA 3 | NIVEL DO RESERVATÓRIO - LT01 | VAZÃO DE ENTRADA- FT01 | VAZÃO DE GRAVIDADE - FT02 | VAZÃO DE RECALQUE - FT03 | PRESSÃO DE SUCÇÃO - PT01 | PRESSÃO DE RECALQUE - PT02 | # duplicates
0 | 0.000 | 0.000 | 0.000 | 3.920 | 0.116 | 0.000 | 0.029 | 5.182 | 0.000 | 2
1 | 0.000 | 0.000 | 0.000 | 4.295 | 0.116 | 0.000 | 0.029 | 5.532 | 0.000 | 2